
    Object tracking and detection after occlusion via numerical hybrid local and global mode-seeking

    Given an object model and a black-box measure of similarity between the model and candidate targets, we consider visual object tracking as a numerical optimization problem. During normal tracking conditions, when the object is visible from frame to frame, local optimization is used to track the local mode of the similarity measure in a parameter space of translation, rotation, and scale. However, when the object becomes partially or totally occluded, such local tracking is prone to failure, especially when common prediction techniques like the Kalman filter do not provide a good estimate of object parameters in future frames. To recover from these inevitable tracking failures, we consider object detection as a global optimization problem and solve it via Adaptive Simulated Annealing (ASA), a method that avoids becoming trapped at local modes and is much faster than exhaustive search. As a Monte Carlo approach, ASA stochastically samples the parameter space, in contrast to local deterministic search. We apply cluster analysis to the sampled parameter space to redetect the object and reinitialize the local tracker. Our numerical hybrid local and global mode-seeking tracker is validated on challenging airborne videos with heavy occlusion and large camera motions. Our approach outperforms state-of-the-art trackers on the VIVID benchmark datasets.
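    The global mode-seeking step above can be sketched in miniature. The snippet below is an illustrative plain simulated-annealing loop over a box-bounded parameter space (e.g. translation, rotation, scale) with a black-box similarity score; it is not the paper's ASA implementation, and the proposal scale, cooling rate, and step count are assumed values for demonstration.

```python
import math
import random

def anneal(similarity, bounds, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimal simulated-annealing search maximizing a black-box score.

    `bounds` is a list of (low, high) pairs, one per parameter
    (e.g. x-translation, y-translation, rotation, scale).
    """
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    best, best_score = list(x), similarity(x)
    score, t = best_score, t0
    for _ in range(steps):
        # Propose a Gaussian perturbation of the current point, clipped to bounds.
        cand = [min(hi, max(lo, xi + rng.gauss(0, 0.1 * (hi - lo))))
                for xi, (lo, hi) in zip(x, bounds)]
        cand_score = similarity(cand)
        # Always accept uphill moves; accept downhill moves with
        # Boltzmann probability, which shrinks as the temperature cools.
        if cand_score > score or rng.random() < math.exp((cand_score - score) / t):
            x, score = cand, cand_score
            if score > best_score:
                best, best_score = list(x), score
        t *= cooling  # geometric cooling schedule
    return best, best_score
```

Running it on a toy unimodal similarity with a known peak recovers the peak; in the tracking setting the similarity would instead compare the object model against the candidate window.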

    A Deep Learning Framework for Automated Vesicle Fusion Detection

    Quantitative analysis of vesicle-plasma membrane fusion events in fluorescence microscopy has been proven important in the study of vesicle exocytosis. In this paper, we present a framework to automatically detect fusion events. First, an iterative searching algorithm is developed to extract image patch sequences containing potential events. Then, we propose an event image to integrate the critical image patches of a candidate event into a single-image joint representation as the input to Convolutional Neural Networks (CNNs). According to the duration of candidate events, we design three CNN architectures to automatically learn features for fusion event classification. Compared on 9 challenging datasets, our proposed method showed very competitive performance and outperformed two state-of-the-art methods.
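    One simple way to build a single-image joint representation from a patch sequence is to tile the patches side by side. The helper below is a hypothetical sketch of that idea (the paper's exact event-image construction may differ); patches are represented as row-major lists of pixel rows.

```python
def make_event_image(patches):
    """Tile equally sized grayscale patches horizontally into one image.

    The result can serve as a single-image joint representation of a
    candidate event, suitable as input to a CNN classifier.
    """
    h = len(patches[0])
    assert all(len(p) == h for p in patches), "patches must share a height"
    # Row r of the event image is row r of every patch, concatenated.
    return [sum((p[r] for p in patches), []) for r in range(h)]
```

For example, tiling two 2x2 patches yields one 2x4 event image.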

    An Attention-guided Multistream Feature Fusion Network for Localization of Risky Objects in Driving Videos

    Detecting dangerous traffic agents in videos captured by vehicle-mounted dashboard cameras (dashcams) is essential for safe navigation in complex environments. Accident-related videos are only a minor portion of driving video big data, and the transient pre-accident processes are highly dynamic and complex. Moreover, risky and non-risky traffic agents can be similar in appearance. These factors make risky object localization in driving video particularly challenging. To this end, this paper proposes an attention-guided multistream feature fusion network (AM-Net) to localize dangerous traffic agents in dashcam videos. Two Gated Recurrent Unit (GRU) networks use object bounding box and optical flow features extracted from consecutive video frames to capture spatio-temporal cues for distinguishing dangerous traffic agents. An attention module coupled with the GRUs learns to attend to the traffic agents relevant to an accident. Fusing the two streams of features, AM-Net predicts the riskiness scores of traffic agents in the video. In support of this study, the paper also introduces a benchmark dataset called Risky Object Localization (ROL). The dataset contains spatial, temporal, and categorical annotations with accident, object, and scene-level attributes. The proposed AM-Net achieves a promising performance of 85.73% AUC on the ROL dataset, and outperforms the current state-of-the-art for video anomaly detection by 6.3% AUC on the DoTA dataset. A thorough ablation study further reveals AM-Net's merits by evaluating the contributions of its different components.
    Comment: Submitted to IEEE-T-IT
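    The attention-guided fusion idea can be illustrated without a deep-learning framework. The sketch below is a toy version, not AM-Net itself: per-agent features from two streams (standing in for the bounding-box and optical-flow GRU outputs) are concatenated, an assumed linear attention vector scores each agent, and the softmax-normalized attention weights scale a linear riskiness score. All weight values here are illustrative.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_streams(box_feats, flow_feats, attn_w, score_w):
    """Toy attention-guided two-stream fusion producing per-agent scores.

    box_feats / flow_feats: one feature vector per traffic agent.
    attn_w / score_w: illustrative linear weights over the fused vector.
    """
    # Concatenate the two streams per agent.
    fused = [b + f for b, f in zip(box_feats, flow_feats)]
    # Attention weights over agents, normalized with softmax.
    attn = softmax([sum(w * x for w, x in zip(attn_w, v)) for v in fused])
    # Riskiness score per agent: attention weight times a linear score.
    return [a * sum(w * x for w, x in zip(score_w, v))
            for a, v in zip(attn, fused)]
```

With attention focused on the first feature dimension and scoring on the last, an agent that is both attended to and high-scoring dominates the output, which mirrors how the attention module suppresses accident-irrelevant agents.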

    Image Data Analytics to Support Engineers' Decision-Making

    Robots such as drones have been leveraged to perform structural health inspections such as bridge inspection. Big data in the form of inspection videos can be collected by cameras mounted on drones. In this project, we develop image analysis algorithms to help bridge engineers analyze this big video data. The bridge engineer defines a region of interest initially; the algorithm then retrieves all related regions in the video, allowing the engineer to inspect the bridge without exhaustively checking every frame. To perform this task, we propose a Multi-scale Siamese Neural Network. The network is initially trained by one-shot learning and is fine-tuned iteratively with a human in the loop. Our neural network is evaluated on three bridge inspection videos with promising performance.
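    The retrieval step can be pictured as a similarity search over region embeddings. In the sketch below, a cosine score stands in for the Siamese network's learned similarity, and the threshold is an illustrative cut-off, not a value from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def retrieve_regions(query_feat, region_feats, threshold=0.9):
    """Return indices of candidate regions similar to the engineer-defined
    query region, so only matching frames need manual inspection."""
    return [i for i, f in enumerate(region_feats)
            if cosine(query_feat, f) >= threshold]
```

A region embedding pointing in (nearly) the same direction as the query is retrieved; orthogonal embeddings are skipped.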

    Estimating Freeway Travel Times using the General Motors Model

    Travel time is a key transportation performance measure because of its diverse applications. Various modeling approaches to estimating freeway travel time have been well developed owing to the widespread installation of intelligent transportation system sensors. However, estimating accurate travel times with existing freeway travel time models is still challenging under congested conditions. Therefore, this study aimed to develop an innovative freeway travel time estimation model based on the General Motors (GM) car-following model. Since the GM model is usually used in a microsimulation environment, the concepts of virtual leading and virtual following vehicles are proposed to allow the GM model to be used in macroscale environments with aggregated traffic sensor data. Travel time data collected from three study corridors on I-270 in Saint Louis, Missouri, were used to verify the travel times estimated by the proposed General Motors travel time estimation (GMTTE) model and two existing models, the instantaneous model and the time-slice model. The results showed that the GMTTE model outperformed the two existing models, with lower mean average percentage errors of 1.62% in free-flow conditions and 6.66% in congested conditions. Overall, the GMTTE model demonstrated its robustness and accuracy for estimating freeway travel times.
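    The GM car-following response underlying the GMTTE model has the standard form a = alpha * v_follow^m / gap^l * (v_lead - v_follow): the follower accelerates in proportion to the speed difference, scaled by its own speed and inversely by the spacing. The function below implements that textbook formulation; the default parameter values are illustrative, not the calibrated values from the study.

```python
def gm_acceleration(v_follow, v_lead, gap, alpha=1.0, m=1.0, l=1.0):
    """General Motors car-following model (textbook form).

    v_follow, v_lead: follower and leader speeds; gap: spacing between them.
    In a GMTTE-style macroscale use, the "virtual leading vehicle" speed
    would come from aggregated sensor data rather than a simulated vehicle.
    """
    return alpha * (v_follow ** m) / (gap ** l) * (v_lead - v_follow)
```

A faster leader yields positive (accelerating) response, a slower leader a negative one, which is the sign behavior the travel time propagation relies on.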

    Fine-grained Activity Classification In Assembly Based On Multi-visual Modalities

    Assembly activity recognition and prediction help to improve productivity, quality control, and safety measures in smart factories. This study aims to sense, recognize, and predict a worker's continuous fine-grained assembly activities on a manufacturing platform. We propose a two-stage network for workers' fine-grained activity classification by leveraging scene-level and temporal-level activity features. The first stage is a feature-awareness block that extracts scene-level features from multiple visual modalities, including red, green, blue (RGB) and hand-skeleton frames. We use transfer learning in the first stage and compare three different pre-trained feature extraction models. We then transmit the feature information from the first stage to the second stage to learn the temporal-level features of activities. The second stage consists of Recurrent Neural Network (RNN) layers and a final classifier. We compare the performance of two different RNNs in the second stage: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). The partial video observation method is used to predict fine-grained activities. In experiments using trimmed activity videos, our model achieves an accuracy of > 99% on our dataset and > 98% on the public dataset UCF 101, outperforming state-of-the-art models. The prediction model achieves an accuracy of > 97% in predicting activity labels using 50% of the onset activity video information. In experiments using an untrimmed video with continuous assembly activities, we combine our recognition and prediction models and achieve an accuracy of > 91% in real time, surpassing state-of-the-art models for the recognition of continuous assembly activities.
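    The partial-video-observation idea reduces to truncating each clip to its onset portion before it reaches the classifier. The helper below is a minimal sketch of that preprocessing step under the stated 50% setting; the function name and the frame representation are assumptions for illustration.

```python
def partial_observation(frames, ratio=0.5):
    """Keep only the onset portion of an activity clip for early prediction.

    With ratio=0.5 the predictor sees the first 50% of frames, matching the
    partial-video-observation setting described in the abstract.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    k = max(1, int(len(frames) * ratio))  # always keep at least one frame
    return frames[:k]
```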

    Heterogeneous Activity Causes a Nonlinear Increase in the Group Energy Use of Ant Workers Isolated from Queen and Brood

    Increasing evidence has shown that the energy use of ant colonies increases sublinearly with colony size, so that large colonies consume less per capita energy than small colonies. It has been postulated that the social environment (e.g., the presence of queen and brood) is critical for this sublinear group energetics, and a few studies of ant workers isolated from queens and brood observed linear relationships between group energy use and group size. In this paper, we hypothesize that the sublinear energetics arise from the heterogeneity of activity in ant groups; that is, large groups have relatively more inactive members than small groups. We further hypothesize that the energy use of ant worker groups that are allowed to move freely increases more slowly than group size even when they are isolated from queen and brood. Previous studies provided only indirect evidence for these hypotheses due to technical difficulties. In this study, we applied automated behavioral monitoring and respirometry simultaneously to isolated worker groups over long time periods, and analyzed the images with state-of-the-art algorithms. Our results show that when activity was not confined, large groups had lower per capita energy use, a lower percentage of active members, and lower average walking speed than small groups; when locomotion was confined, however, per capita energy use was constant regardless of group size. The quantitative analysis shows a direct link between variation in group energy use and the activity level of ant workers isolated from queen and brood.
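    Sublinear group energetics is conventionally modeled as a power law E(n) = b0 * n^beta with beta < 1, which directly implies declining per capita energy use with group size. The snippet below illustrates that arithmetic; b0 and beta are illustrative values, not estimates from this study.

```python
def per_capita_rate(n, b0=1.0, beta=0.85):
    """Per capita energy use under a power-law group metabolic rate.

    Group rate: E(n) = b0 * n**beta. With beta < 1 (sublinear scaling),
    E(n)/n shrinks as the group grows; with beta = 1 (linear scaling,
    as observed when locomotion is confined), it is constant.
    """
    return b0 * n ** beta / n
```

So a hundred-member group has lower per capita use than a ten-member group when beta < 1, and identical per capita use when beta = 1.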